A Gibbs sampler for the identification of gene expression and network connectivity consistency
Abstract
MOTIVATION: Data from DNA microarrays and ChIP-chip binding assays often form the basis of transcriptional regulatory analyses. However, experimental noise in both data types, combined with the environmental dependence of ChIP-chip binding data and the lack of correlation between binding and regulation, complicates analyses that use these complementary data sources. Therefore, to minimize the impact of these inaccuracies on transcription analyses, it is desirable to identify instances where gene expression and ChIP-chip data agree, on the premise that inaccuracies are less likely to be present when separate data sources corroborate each other. Current methods for such identification make key assumptions that limit their applicability, yield high false positive and false negative rates, or both. The goal of this work was to develop a method that makes a minimal number of assumptions, and is thus widely applicable, and that can identify agreement between gene expression and ChIP-chip data at a higher confidence level than current methods.
RESULTS: We demonstrate in Saccharomyces cerevisiae that currently available ChIP-chip binding data explain microarray data from a variety of environments only as well as randomized networks with the same connectivity density. This suggests a high degree of inconsistency between the two data types and illustrates the need for a method that can identify where they are consistent. Here we have developed a Gibbs sampling technique to identify genes whose expression and ChIP-chip binding data are mutually consistent. Compared with current methods that could perform the same task, the Gibbs sampling method outperforms them at high levels (>50%) of transcription network and gene expression error and performs similarly at lower levels. Using this technique, we show that on average 73% more gene expression features can be captured per gene than with the unfiltered use of gene expression and ChIP-chip-derived network connectivity data. The method described here can also be generalized to other transcription connectivity data (e.g. sequence analysis).
AVAILABILITY: The algorithm is available on request from the authors and will soon be posted on the web. See the author's homepage for details: http://www.seas.ucla.edu/~liaoj/
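The randomized-network baseline mentioned in RESULTS can be illustrated with a minimal sketch. It assumes the connectivity data are stored as a 0/1 matrix (genes by regulators) and that randomization means shuffling all entries of that matrix. The abstract does not describe the authors' actual randomization procedure, so the names `connectivity_density` and `randomize_network` and the toy matrix are hypothetical.

```python
import random


def connectivity_density(network):
    """Fraction of gene/regulator pairs marked as connected (1)."""
    total = sum(len(row) for row in network)
    return sum(sum(row) for row in network) / total


def randomize_network(network, seed=None):
    """Shuffle all entries of a 0/1 matrix, keeping its shape and density.

    Assumption: this is one simple way to build a random network with the
    same connectivity density; it is not taken from the paper.
    """
    rng = random.Random(seed)
    n_cols = len(network[0])
    flat = [entry for row in network for entry in row]
    rng.shuffle(flat)
    # Cut the shuffled entries back into rows of the original width.
    return [flat[i:i + n_cols] for i in range(0, len(flat), n_cols)]


# Hypothetical toy network: 4 genes (rows) x 3 regulators (columns).
chip_network = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]
random_network = randomize_network(chip_network, seed=0)
```

Shuffling every entry keeps the total number of connections fixed but not the number of targets per regulator. A comparison that should also preserve per-regulator connectivity would shuffle within each column instead.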
Similar resources
In silico identification of miRNAs and their target genes and analysis of gene co-expression network in saffron (Crocus sativus L.) stigma
As an aromatic, colorful, and flavorful plant, saffron (Crocus sativus L.) owes these properties to a growing class of carotenoid-derived secondary metabolites, the apocarotenoids. Given the critical role of microRNAs in secondary metabolite synthesis and the limited number of identified miRNAs in C. sativus, one may see the point how the characte...
Connectivity as a Measure of Power System Integrity
Measures of network structural integrity useful in the analysis and synthesis of power systems are discussed. Signal flow methodology is applied to derive an expression for the paths between sources and sinks in a power network. Connectivity and reachability properties of the network are obtained using the minors of a modified connectivity matrix. Node-connectivity, branch connectivity and mix...
Identification of key genes and pathways involved in vitiligo vulgaris by gene network analysis
Background and Aim: Vitiligo vulgaris is an acquired, chronic skin and hair condition characterized clinically by loss of melanin, which, if untreated, is typically progressive and irreversible. The aim of the present study was to identify potential genes involved in the pathogenesis of vitiligo. Methods: One dataset of mRNA expression in patients with vitiligo (GSE65127) was obtained from ...
Comparison of Maximum Likelihood Estimation and Bayesian with Generalized Gibbs Sampling for Ordinal Regression Analysis of Ovarian Hyperstimulation Syndrome
Background and Objectives: Analysis of ordinal outcome data can lead to biased estimates and large variance when the data are sparse. The objective of this study is to compare parameter estimates of an ordinal regression model under maximum likelihood and a Bayesian framework with generalized Gibbs sampling. The models were used to analyze ovarian hyperstimulation syndrome data. Methods: This study use...
Identification of Prognostic Genes in Her2-enriched Breast Cancer by Gene Co-Expression Network Analysis
Introduction: HER2-enriched subtype of breast cancer has a worse prognosis than luminal subtypes. Recently, the discovery of targeted therapies in other groups of breast cancer has increased patient survival. The aim of this study was to identify genes that affect the overall survival of this group of patients based on a systems biology approach. Methods: Gene expression data and clinical infor...
Journal title:
- Bioinformatics
Volume 22, Issue 24
Pages -
Publication date 2006